Efficient schemes for constructing reliable computing nodes in distributed systems

ABSTRACT

This invention relates to a computing system, a fail-silent node for use in a computing system and a method of organizing information so that a number of microprocessors in a computing node, which are arranged to receive messages from other components in the computing system and to process the received messages so as to transmit the results of this processing to other components in the system, compare the results of their processing and send nothing out from the node unless either all the microprocessors in the node produce identical results or more than half of the microprocessors in the node produce identical results. This is achieved by manipulating the order in which messages are processed by each microprocessor so as to ensure that each microprocessor in the node receives the same messages and orders them so that they are processed in the same order within each microprocessor, thus ensuring, if all the microprocessors are functioning correctly, that the same results are produced.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates to a computing node in or for use in a computer processing system and particularly a fail-silent computing node.

2. Description of the Prior Art

It is known that replicating computer processing on different computer microprocessors provides a practical means of constructing computer systems capable of tolerating arbitrary computer processor failures. A computing node is composed of a number of conventional computer processors on which applications are replicated to achieve tolerance to failures. Computing nodes are connected via a network.

Typically, individual hardware components do not inherently fail by becoming silent; rather, their output is corrupted. For some devices, simple models of their correct behaviour exist and can thus be used as a checking means, e.g., memory devices should output exactly the data that was originally input to them. In these cases, faults can easily be identified by the addition to the data of redundant information, e.g., parity bits, which can be checked when the data is output. However, for complex devices, for which there is no simple correlation between their inputs and subsequent outputs, e.g., microprocessors, the easiest error detection method of adding redundancy is to duplicate the device and compare the outputs of the two devices.

In typical existing implementations of a fail-silent node, duplicated microprocessors, or a plurality of microprocessors, are closely coupled and run in micro-synchronisation. Each microprocessor is initialised to an identical state and then performs identical actions on identical data for each tick of the system clock. Hence on every clock cycle the data output by the components is identical. The principles underlying the node architectures can be explained by examining FIG. 1, which is a diagrammatic representation of a conventional fail-silent node. Since the data streams to be compared are in exact lock-step, a simple hardware comparator (cmp) can be used to check that the data streams are identical and to prevent any outputs once a discrepancy is detected. Although two replicas are actually running, because they are micro-synchronised and compared by the dedicated hardware comparator, the application running is unaware of the replication and the comparisons undertaken. When this fail-silent technique is used, the correct and erroneous message sets sent over the network are distinguished by the fact that the only erroneous messages that can be sent are incomplete correct messages, since the occurrence of a fault during the transmission of a message can stop transmission within one clock tick. Such incomplete messages are easily identified by the receiver since they will contravene the lowest levels of network protocols.

Fail-silent nodes have been used widely, for example, in commercial transaction computer processing systems. Such nodes have been designed with the assistance of specialised comparator hardware and clock circuits. A common (reliable) clock source is used for driving a pair of processors that execute in lock-step, with the outputs compared by a (reliable) comparator; no output is produced once a disagreement is detected by the comparator. Note that since only two microprocessors are used within a node to check on each other, the fail-silent characteristics of a node can be guaranteed only if no more than one microprocessor within a node is faulty.

Intuitively, fail-silent behaviour ought to mean that a node never generates an erroneous output, i.e., the node can only either generate correct outputs or remain silent. However, this is impossible to implement in practice since output messages take a finite time to transmit, and a fault may occur leading to an error during the transmission of a message. A definition of fail-silence must include the case where a message receiver rejects such erroneous messages. Thus a two-microprocessor node will be said to exhibit fail-silent behaviour in the following sense: the outputs produced by it (if any) are either valid messages or detectably invalid messages; this behaviour is guaranteed so long as no more than one microprocessor in the node fails.

The disadvantages of the above-described fail-silent node are as follows:

Firstly, with this type of node every new microprocessor architecture is likely to require substantial design overheads. Secondly, tightly synchronised processors may not be resilient to transients which may affect the microprocessors in an identical manner, commonly known as common mode transient failures. Thirdly, there may be market resistance from customers to the use of these highly specialised and customised non-standard nodes. Finally, lock-step synchronisation at very high clock speeds (50-100 MHz) may well turn out to be difficult or impossible to achieve.

SUMMARY OF THE INVENTION

It follows from the above that there is a need to provide a fail-silent node which does not require the microprocessors to be synchronised in lock-step; rather, the microprocessors are synchronised with one another only when sending or receiving information. The microprocessors of a node function to execute synchronisation and order protocols "to keep in step". We have achieved this by providing fail-silent nodes which have an ordering mechanism so that identical messages in identical order are selected for processing, thus providing identical outputs. A node implemented according to the invention does not require dedicated clock or comparator circuits (the hardware signified by dotted lines in FIG. 1 can thus be dispensed with). The invention has further advantages. Firstly, technology upgrades are easier: the principles behind the invention do not change, so the techniques can be easily ported to any pair of microprocessors. Secondly, because the replicated computations are loosely synchronised, the architecture is likely to be capable of detecting common mode transient failures, since transients are unlikely to affect the computations on the microprocessor pairs in an identical fashion.

According to a first aspect of the invention there is therefore provided a computing system comprising a computing node arranged to receive messages from other components in the system, to process received messages, and to transmit messages to other components in the system; the computing node comprising:

a) a plurality of microprocessors linked together and arranged to process received messages;

b) a means for ordering the messages to be processed such that similar messages in identical order are selected for processing, which then produce identical outputs; and

c) means for comparing the outputs produced by the microprocessors of the node and for controlling the output of the node so that nothing is output from the node unless all the microprocessors in the node give identical output, the node output then being the same as the identical outputs.

In an alternative embodiment of the invention the said means for comparing the outputs produced by the microprocessors of the node and for controlling the output of the node operates so that nothing is output from the node unless more than half of the number of microprocessors in the node give identical output, the node output then being the same as the identical outputs.

According to a second aspect of the invention there is provided a fail-silent node in or for use in a processing system comprising:

a plurality of microprocessors having interface means for enabling communications with other components in the system, such as for example, other nodes, and a link means to enable communication between said processors in said node, characterised in that said microprocessors further include:

a) authentication means so that each microprocessor can confirm the integrity of any message it receives;

b) signature means so that each microprocessor can label a message with its own, preferably unique, signature;

c) ordering means so that each microprocessor can order authenticated messages in time-stamped order;

d) diffusion means so that each microprocessor can send messages to other microprocessors; and

e) comparison and control means so that the outputs produced by each microprocessor can be compared; whereby similar messages are processed in identical order and the same outputs are produced by each microprocessor, so that nothing is output from the node unless all the microprocessors in the node give an identical output, the node output then being the same as the identical outputs.

In an alternative embodiment of the invention the said means for comparing and controlling the outputs produced by the microprocessors of the node and for controlling the output of the node operates so that nothing is output from the node unless more than half of the number of microprocessors in the node give identical output, the node output then being the same as the identical outputs.

According to either aspect of the invention, the ordering means comprises the provision of clock means within each microprocessor, which clock means are synchronised such that the measurable difference between readings of the clocks at any instant is bounded by a known maximum constant. Preferably the clock means is a logical clock.

Alternatively, the ordering means comprises the designation of at least one microprocessor as a Leader microprocessor and at least another of said microprocessors as a Follower microprocessor, whereby the Leader receives messages from outside the node and sends said messages to the Follower such that the order in which requests are processed is dictated by the Leader microprocessor.

In this ideal embodiment the Leader processes the information and then sends the result of this processing to the Follower so that the Follower can compare this result with its own generated result. In the event that the two results are identical, the Follower is adapted to produce a multiple signed message which is transmitted through the system. In the event that the two outputs are not identical a multiple signed message is not produced. In addition, the Follower is provided with means which enables it to monitor messages received from outside the node whereby faults can be detected in the Leader.

Preferably still, said comparison means of said computing system or said comparison and control means of said fail-silent node compares incoming messages with those produced locally so that successful messages can be countersigned by the local microprocessor and the subsequently generated multiple signed message can be transmitted through the system. In the event that the comparison fails, multiple signed messages are not produced and thus such messages are not sent through the system.

Preferably said computing system or said fail-silent node includes receiving means which discards duplicate messages.

Preferably said computing system or said fail-silent node includes microprocessors which are adapted to receive said messages in parallel. This latter arrangement is not present in the aforementioned Leader/Follower arrangement.

According to a yet further aspect of the invention there is provided a method for ordering messages to be processed within a fail-silent computer node comprising:

a) receiving messages at a microprocessor;

b) authenticating said messages so as to confirm the integrity of same;

c) stamping said messages to be ordered with a time-stamp corresponding to a local clock reading at said microprocessor;

d) signing said messages;

e) diffusing either the signed, time-stamped message or a copy of this signed, time-stamped message via a link means to other microprocessors in the node;

f) ordering a plurality of signed, time-stamped messages in time-stamped order;

g) processing the ordered messages according to their time-stamped order;

h) signing the processed message output;

i) diffusing either this signed, processed output message or a copy of this signed, processed output message via a link means to other microprocessors in the node; and

j) comparing the output messages in the node and, where a pre-determined number of said output messages are identical, releasing said output messages from said node.

In a preferred embodiment of the invention said pre-determined number equals the total number of microprocessors in the node.

In an alternative embodiment of the invention said pre-determined number is a number greater than half the number of said microprocessors in said node.

Preferably said messages are received at said microprocessors in a parallel manner.

The method of ordering involves a process of stabilisation whereby incoming messages are delayed for a pre-determined length of time before they are queued in the time-stamped order of messages.

In the Leader/Follower embodiment of the invention, the pre-determined length of time for which incoming messages are delayed in the Leader microprocessor equals 0.

In a preferred method, for a two-microprocessor fail-silent node, the process of ordering or stabilisation involves:

a) diffusing messages according to a First In First Out policy;

b) receiving a time-stamped message with a time-stamp equal to T; and

c) where T is greater than the local clock value, advancing the local clock to a time T+1 and stabilising all messages with a time-stamp less than or equal to T; or

d) where T is less than or equal to the local clock value, stabilising all messages with a time-stamp less than or equal to T.

This preferred method enhances the efficiency of the computing system or node.

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the invention will now be described by way of example only with reference to the accompanying Figures wherein:

FIG. 1 represents a diagrammatic illustration of the prior art showing in phantom those components made obsolete by the present invention;

FIG. 2 represents a diagrammatic illustration of a fail-silent node in accordance with the invention;

FIG. 3 represents a diagrammatic illustration of the operation of a fail-silent node in accordance with the invention; and

FIG. 4 represents a diagrammatic illustration of a preferred embodiment of a fail-silent node in accordance with the invention and particularly a Leader/Follower fail-silent node in accordance with the invention.

FIGS. 5(A-B) represent a diagrammatic illustration of an improved time-based ordering means.

FIG. 6 is a table showing performance measures.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In this detailed description it is assumed that computer systems have been structured to include a number of computer microprocessors that interact only by way of messages. Messages are defined as data which is sent from one microprocessor to another. Further, it is assumed that computations performed by microprocessors are deterministic, that is to say, if all the correctly functioning replicas of a process have identical initial states then they will continue to produce identical responses to incoming messages provided the messages are processed in an identical order.

The overall node architecture is shown in FIG. 2. Each of the two microprocessors (P₁, P₂) has network interfaces (n₁, n₂) for inter-node communication over (redundant) networks; in addition, the microprocessors are internally connected by a communication link, l, or alternatively, the microprocessors may be linked by external means, for example by use of interfaces (n₁, n₂). Each non-faulty microprocessor in a node is assumed to be able to sign a message it sends by affixing the message with its (the microprocessor's) unforgeable signature; it is also assumed to be able to authenticate any received message, thereby detecting any attempts to corrupt the message. For example, digital signature based techniques provide such functionality with extremely high probability.

It is necessary that the replicas of computational processes on microprocessors within a node select identical messages for processing, to ensure that they produce identical outputs. Identical message selection can be guaranteed by maintaining identical ordering of messages at input ports and ensuring that application processes pick up messages at the head of their respective input ports. An ordering mechanism is then required to ensure identical ordering if both the microprocessors are non-faulty.

Each non-faulty microprocessor of a node, as shown in the FIG. 2 arrangement, has the following mechanisms:

a) Diffusion: this takes the messages produced by the application process running on that microprocessor, signs them and sends them to the other microprocessor of the node for comparison.

b) Comparison: this authenticates all incoming messages from the neighbouring microprocessor; an authenticated message is compared with its counterpart produced locally. If the comparison succeeds, the authenticated message is countersigned and this doubly signed message is transmitted to destination nodes. A message that cannot be compared because its counterpart does not arrive, or a comparison that detects a disagreement, indicates a failure. Once a failure is indicated, the comparison mechanism stops. No further doubly signed messages are produced by the node.

c) Receiving: this accepts authentic messages for processing from the network, discarding any duplicates; such valid messages are sent to the local ordering mechanism.

d) Ordering: this mechanism negotiates with its counterpart in the other microprocessor and attempts to construct identical queues of valid messages for processing by the computation processes.
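The Diffusion and Comparison mechanisms can be sketched as follows. This is a minimal illustration only, not the implementation of the node: the names `sign` and `Comparator` are hypothetical, and an HMAC tag is used purely as a stand-in for the unforgeable digital signature assumed in the text (a real node would use a public-key signature scheme, since with a shared HMAC key the verifier could also forge tags).

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Stand-in signature (assumption): an HMAC-SHA256 tag over the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


class Comparator:
    """Comparison mechanism of one microprocessor in a two-microprocessor node."""

    def __init__(self, own_key: bytes, neighbour_key: bytes):
        self.own_key = own_key
        self.neighbour_key = neighbour_key
        self.stopped = False  # once True, no further doubly signed output

    def compare(self, local_output: bytes, neighbour_output: bytes,
                neighbour_sig: bytes):
        """Return (payload, signature, countersignature) on agreement.

        Returns None and stops permanently if the neighbour's message is
        not authentic or disagrees with the locally produced output.
        """
        if self.stopped:
            return None
        expected = sign(self.neighbour_key, neighbour_output)
        authentic = hmac.compare_digest(expected, neighbour_sig)
        if not authentic or neighbour_output != local_output:
            self.stopped = True
            return None
        # Countersign the authenticated message to make it doubly signed.
        countersig = sign(self.own_key, neighbour_output + neighbour_sig)
        return (neighbour_output, neighbour_sig, countersig)
```

A missing counterpart (a message that never arrives) would be detected by a time-out, which is left out of this sketch.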

One known method of achieving ordering requires that the physical clocks of both the microprocessors of a node are synchronised such that the measurable difference between readings of clocks at any instant is bounded by a known constant.

Essentially, in the known method for ordering, the order process of a microprocessor stamps a message to be ordered with its local clock reading. A copy of the time-stamped message is signed and sent over the link to the order process of the other microprocessor in the node. If T is the time-stamp of the message received from or sent to the order process of the other microprocessor, then the message becomes stable at local clock time T+d+e, where d is the maximum transmission time taken for a time-stamped message to travel from one order process to another order process over the link and e is the maximum difference between the clocks of the two microprocessors. A message with time-stamp T will be designated stable if no message with a time-stamp smaller than T will be received by an order process. Stable messages are queued at the relevant input ports in increasing time-stamp order (with care taken not to queue a stable message if its replica has already been queued).
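The stability rule of this known method can be illustrated with a short sketch. It is written under stated assumptions: the class name `TimedOrderer`, the `(timestamp, origin, body)` message shape and the use of the origin as a tie-breaker for equal time-stamps are all hypothetical, and clock readings are passed in explicitly rather than read from hardware.

```python
import heapq


class TimedOrderer:
    """Queues messages once they are stable, i.e. at local time T + d + e."""

    def __init__(self, d: int, e: int):
        self.delay = d + e      # stability delay from the text
        self.pending = []       # min-heap of (timestamp, origin, body)
        self.seen = set()       # used to avoid queueing a replica twice
        self.queue = []         # stable messages in increasing time-stamp order

    def submit(self, timestamp: int, origin: int, body: str) -> None:
        """Record a locally stamped or relayed message, ignoring replicas."""
        key = (timestamp, origin, body)
        if key in self.seen:
            return
        self.seen.add(key)
        heapq.heappush(self.pending, key)

    def release(self, now: int) -> list:
        """Move every message with T + d + e <= now to the ordered queue."""
        while self.pending and self.pending[0][0] + self.delay <= now:
            self.queue.append(heapq.heappop(self.pending))
        return self.queue
```

For example, with d = 2 and e = 1, a message stamped 9 becomes stable at local time 12 and one stamped 10 at local time 13.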

The above operation ensures that the two following properties are met:

Agreement: all the non-faulty replicas of a process receive the same input messages;

Order: all the non-faulty replicas have identical input message queues or ordered message queues.

So, if all the non-faulty replicas of a process of a node have identical initial states and replicas always pick messages at the head of queues for processing, then identical output messages will be produced by them.

We have developed a time-based ordering means for use in a fail-silent node which will now be explained with reference to FIG. 3. The detailed architecture of a node is depicted in FIG. 3, where the major components of the system within a microprocessor of a node and their interaction are summarised. The RX_INT and RX_EXT processes are responsible for receiving and authenticating messages from inside and outside the node respectively. An authentic message coming from outside a node will have two distinct signatures (for simplicity, the authentication of internal messages, received from the other microprocessor in the node, is omitted from FIG. 3). Similarly, the TX_INT and TX_EXT processes are responsible for sending messages inside and outside the node respectively. The actual computing application is represented in FIG. 3 by the Service process. For the purpose of sending and receiving valid messages, each microprocessor maintains several message queues:

(i) Received Message Queue (RMQ): Contains valid received messages intended for ordering.

(ii) Processed Message Queue (PMQ): Contains unsigned output messages produced by computational processes. These messages must be validated, that is, checked by the comparator, before transmission to the final destination.

(iii) External Candidate Message Queue (EMQ): Contains singly signed messages that have been received for validation.

(iv) Internal Candidate Message Queue (IMQ): Contains unsigned messages, each waiting for a signed message with identical content to arrive in EMQ.

(v) Delivered Message Queue (DMQ): Contains ordered, signed messages to be delivered to the application process for processing.

(vi) Neighbouring Message Queue (NMQ): Contains signed messages to be relayed to the neighbouring microprocessor of the node. Messages could either be for ordering (from the order process) or for validation (from the diffuse process).

(vii) Compared Message Queue (CMQ): Contains doubly signed messages awaiting output.

(viii) Order Message Queue (OMQ): Contains messages relayed by the neighbour microprocessor for ordering.

A message received by the RX_INT process could either be for validation, in which case it is deposited in EMQ, or for ordering, in which case it is deposited in OMQ.

The time-based ordering means has been improved so as to reduce the stability delay. It is of note that physical clocks are replaced by counters (logical clocks) which are no longer synchronised. The details are as follows.

The arrival of a relayed message in OMQ can be used to reduce the stability delay, d+e, as defined previously, imposed by ordering messages. As each microprocessor is unable to generate a time-stamp smaller than any that it has previously generated, and messages are diffused (relayed) according to a First In First Out (FIFO) policy, the time-stamp of a message received defines intervals of time where messages can be stabilised earlier than the time d+e. FIG. 5 illustrates this improvement.

For instance, a message with time-stamp smaller than the local logical time is received. As no more messages will be received with time-stamps smaller than the time-stamp of this message (local time-stamps will be greater than the local time and remote messages will necessarily have time-stamps greater than that of the received message), all previously received messages, local or remote, with time-stamps smaller than or equal to the time-stamp of the diffused message are designated stable.

Alternatively, a message with time-stamp greater than the local time is received. In this case it is certain that no more messages will be received with time-stamp smaller than the local time, thus every previously received message with time-stamp smaller than or equal to the local time is stable. Further, the local logical clock is advanced so as to exceed the time-stamp of the received message.
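The two cases just described can be sketched for one microprocessor of a two-microprocessor node. This is a minimal sketch under assumptions, not the patented implementation: the names `Stamped` and `LogicalClockOrderer` are hypothetical, the origin identifier is an assumed tie-breaker for equal time-stamps, a local message is stamped with the current counter value before the counter is incremented, and a received time-stamp equal to the local clock is handled by the second case.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Stamped:
    timestamp: int
    origin: int                       # hypothetical tie-breaker
    body: str = field(compare=False)


class LogicalClockOrderer:
    """Order process using an unsynchronised counter instead of a clock."""

    def __init__(self, ident: int):
        self.ident = ident
        self.clock = 1
        self.pending = []             # min-heap of not yet stable messages
        self.stable = []              # stable messages, in delivery order

    def stamp_local(self, body: str) -> Stamped:
        """Stamp a local message; the caller relays it in FIFO order."""
        msg = Stamped(self.clock, self.ident, body)
        self.clock += 1
        heapq.heappush(self.pending, msg)
        return msg

    def on_relayed(self, msg: Stamped) -> None:
        """Handle a message relayed by the neighbouring microprocessor."""
        heapq.heappush(self.pending, msg)
        if msg.timestamp < self.clock:
            # Nothing with a time-stamp <= msg.timestamp can still arrive.
            self._stabilise(msg.timestamp)
        else:
            # Everything up to the local time is stable; then advance the
            # local clock so that it exceeds the received time-stamp.
            self._stabilise(self.clock)
            self.clock = msg.timestamp + 1

    def _stabilise(self, bound: int) -> None:
        while self.pending and self.pending[0].timestamp <= bound:
            self.stable.append(heapq.heappop(self.pending))
```

Because stabilisation is triggered by message arrival rather than by waiting for d+e to elapse, a busy link lets both microprocessors release messages almost as soon as they have been exchanged.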

In the above ordering means, the two microprocessors operate symmetrically, that is to say, the microprocessors execute the same method. However, considerable performance improvements can be obtained by using an asymmetrical approach. We assign different roles to each of the microprocessors forming a node. We will term one the Leader and the other the Follower.

The order in which the requests will be processed in the node is dictated by the Leader microprocessor. The Leader selects one of the input messages for processing and sends a copy to the Follower. After transmission of this message to the Follower, the Leader processes the message, signs it and afterwards sends a copy of the output message to the Follower. Meanwhile the Follower receives the message, processes it and waits for the Leader's output message. The Follower then picks up the Leader's output message and compares it with its own. If the two messages agree, a doubly signed message is output; otherwise, the Follower microprocessor stops its activities and no doubly signed messages will be output. It is necessary to have communication in the Follower-Leader direction so that the Leader can detect faults in the Follower. Also, the Follower must monitor the messages received from outside the node (omitted for simplicity from FIG. 4) in order to detect faults occurring in the Leader. FIG. 4 shows the architecture for the Leader/Follower fail-silent node. The various processes and queues perform the same functions as described earlier in this section. In the Follower microprocessor, the number of signatures appended to a message received from the Leader determines its destination (EMQ or DMQ). Two signatures indicate the message is to be processed.

The Service, Diffuse and Compare processes work in almost the same way as in the normal fail-silent architecture. The Rx_Int and Tx_Int processes receive and transmit messages within the node. The Rx_Ext and Timing processes on the Follower are responsible for detecting omission and timing faults occurring in the Leader. In a correctly functioning system, both microprocessors will receive the same request messages from outside the node (although not necessarily in the same order). The Follower's Rx_Ext process receives each request from outside the node and deposits it in the External Received Message Queue (ERMQ) with an associated time-out (if a copy of the message is already there, having been deposited by the Timing process, then the message time-out is reset). The Timing process picks up each message in the Internal Received Message Queue (IRMQ) and resets the time-out associated with its counterpart in the ERMQ (if its counterpart is not there, the message is placed in ERMQ with an associated time-out). If a time-out 'fires', the Follower assumes that the Leader has failed and ceases its own activities. As a result, no more doubly signed messages will be output. To solve the problem of detecting omission and timing failures in the Follower, it suffices to make the Follower send to the Leader the singly signed messages that are supposed to be output. After comparing this with its own output, the Leader will also output a doubly signed message. If the expected message does not arrive in a 'reasonable' time, the Leader will stop sending messages to the Follower, and so no more doubly signed messages will be output.
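The Follower's monitoring of the Leader through ERMQ time-outs can be sketched as below. This is a hedged illustration only: the class name `LeaderWatchdog` and its methods are hypothetical, time is passed in explicitly, and the sketch assumes that "resetting" a time-out when the counterpart copy arrives means cancelling it, which is one possible reading of the text.

```python
class LeaderWatchdog:
    """Follower-side check that external and Leader-relayed copies pair up."""

    def __init__(self, timeout: int):
        self.timeout = timeout
        self.ermq = {}          # message id -> (deadline, depositing source)
        self.stopped = False    # True once the Leader is assumed faulty

    def _deposit(self, msg_id: str, source: str, now: int) -> None:
        entry = self.ermq.get(msg_id)
        if entry is None:
            # First copy seen: wait for its counterpart until the deadline.
            self.ermq[msg_id] = (now + self.timeout, source)
        elif entry[1] != source:
            # Counterpart arrived from the other path: cancel the time-out
            # (assumed meaning of "reset").
            del self.ermq[msg_id]

    def on_external(self, msg_id: str, now: int) -> None:
        """Rx_Ext: a request received directly from outside the node."""
        self._deposit(msg_id, "external", now)

    def on_relayed(self, msg_id: str, now: int) -> None:
        """Timing process: a request relayed by the Leader (from IRMQ)."""
        self._deposit(msg_id, "leader", now)

    def tick(self, now: int) -> bool:
        """Return True if a time-out has fired and the Follower must stop."""
        if any(deadline <= now for deadline, _ in self.ermq.values()):
            self.stopped = True
        return self.stopped
```

A request that the Leader never relays, or a relayed request that never arrives from outside, leaves an unmatched entry whose deadline eventually passes, at which point the Follower ceases its activities.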

To calculate the time to process a request for this node, it is necessary to analyse the activities in both microprocessors. As the activities in the Leader and Follower microprocessors are executed in parallel, we cannot simply add the times of the activities executed in each node. The Service processes are executed in both microprocessors in parallel. However, the Follower has to wait for the request message sent by the Leader before service can begin (the wait time is equal to T_{diffuse}). The Compare process in the Follower microprocessor has to wait for the local message produced by the Service process (in the Follower) and for the Leader's counterpart of this message. In general, if request messages and local response messages are sent by the Leader to the Follower through independent channels, it is likely that the Compare process will never have to wait for the neighbour's response message, as this message will be sent while the Follower is executing Service. More formally, the time that the Service process in the Follower will wait for the request message plus the time that the Compare process will wait for the neighbour's response message is equal to T_{diffuse}. Thus T_{FS1/f} is given by the following expression:

    T_{FS1/f} = T_{Rx-Ext} + T_{diffuse} + T_{service} + T_{compare} + T_{Tx-int}

and following the same approach of the previous section, we conclude that:

    T_{FS1/f} = T_{Non-Rep} + T_{diffuse} + T_{compare}
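Equating the two expressions for the request time makes the non-replicated time explicit. The step below is inferred from those two formulas alone; the text does not itself state which terms make up the non-replicated time.

```latex
\begin{align*}
T_{\text{Non-Rep}} &= T_{\text{FS1/f}} - T_{\text{diffuse}} - T_{\text{compare}} \\
                   &= T_{\text{Rx-Ext}} + T_{\text{service}} + T_{\text{Tx-int}}
\end{align*}
```

On this reading, the overhead that replication adds to each request is the sum of the diffuse and compare times.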

Since microprocessors in a fail-silent node must exchange at least one message per request (the message to be compared), the Leader/Follower soft fail-silent node has near optimal performance for a software implemented fail-silent node.

When discussing the Leader-Follower mechanism described above, it is necessary to examine the performance of both Leader and Follower since they are executing different protocols. The input delay in the Follower is defined to be the time between remove(m, Rx_Ext) at the Leader and remove(m, DMQ) at the Follower. Hence it reflects the time taken for the Leader to receive the message, relay it to the Follower and have the Follower remove it from DMQ. The output delays of Leader and Follower are significantly different. In the Follower, the output delays are smaller than in the Leader because the Leader begins to service the request before the Follower, so that when the Follower is ready to compare its result, the Leader will have already sent (or be sending) its response. If the comparison at the Follower is successful, the Follower outputs the compared message before passing its response to the Leader. Hence the output delay at the Leader reflects this additional time.

The experimental figures given in Table 1 indicate that adopting the Leader-Follower mechanism within a fail-silent node leads to a significant improvement in performance. The overhead of using soft fail-silent nodes is to produce a delay in response of approximately 3.7 ms in a lightly loaded system up to 6 ms when messages are constantly queued awaiting service. In either case, the performance of the Leader/Follower fail-silent node is considerably better than either of the fail-silent nodes employing an order protocol. If the application services involve lengthy computations then the percentage overhead involved in adding replication is extremely small. It is only when communication time between nodes outweighs computation time within nodes that the cost of replication becomes significant.

Modification of the invention to accommodate a fail-silent node including two or more microprocessors

Fail-silent nodes stop issuing valid messages as soon as a fault is detected in the node. However, it is possible to build multi-microprocessor nodes which mask failures and continue to work in the presence of failures. An N failure masking node contains N microprocessors and continues to work provided not more than f = (N-1)/2 of these processors fail. Failure masking nodes also require an ordering mechanism for incoming messages. However, the performance of the ordering mechanism of failure masking nodes can be enhanced by applying an extension of the logical clock method as described above. The description of N failure masking nodes is more complex since the nodes must continue to work despite the presence of up to f arbitrary processor failures within a node. Below we describe the method:
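The failure bound and the "more than half" release rule can be illustrated with a small sketch. The function names are hypothetical, and rounding f down for even N is an assumption, since the text gives only f = (N-1)/2.

```python
from collections import Counter


def max_tolerated_failures(n: int) -> int:
    """f = (N-1)/2, rounded down for even N (rounding is an assumption)."""
    return (n - 1) // 2


def masked_output(outputs: list):
    """Return the output given by more than half of the microprocessors.

    Returns None when no output has such a majority, in which case
    nothing is released from the node.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if 2 * count > len(outputs) else None
```

For a three-microprocessor node, f is 1, so a single faulty output is outvoted by the two agreeing ones.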

Each non-faulty microprocessor P_(i) in an N failure masking nodemaintains a logical clock C_(i) which is first initialised to 1 when thenode is started, and whose value will only increase with the passage ofreal time.

When microprocessor P_(i) receives an authentic external message M (that is to say, a message with f+1 signatures), it composes an internal message comprising the contents of M, a local logical time-stamp and the identity of the microprocessor (ie P_(i)). This composed message is deposited in a message pool called received_(i) and C_(i) is incremented by 1; this ensures that a non-faulty P_(i) will prepare internal messages with increasing time-stamps. The message is then signed and sent to all other microprocessors in the node.

When microprocessor P_(i) receives an authentic internal message m with s distinct signatures, s≧1, it deposits m in its pool received_(i), if m is not already there; if m is a new message, C_(i) is set to max{C_(i), m.T+1}. Because of this operation a non-faulty P_(i) will never prepare and send a different message m' with a time-stamp less than that of the original message after having received the original message. If the number of signatures does not exceed f, the received message is countersigned and sent to all other microprocessors in the node that have not signed m. The messages in received_(i) which have time-stamps smaller than the smallest logical clock value in the node are stable and can be ordered according to their time-stamps.
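The clock and pool handling set out in the preceding paragraphs can be illustrated by the following minimal sketch. It is not the claimed implementation: signatures, countersigning and transmission are omitted, the class and function names are hypothetical, the smallest logical clock value in the node is simply passed in as a parameter, and breaking time-stamp ties by originator identity is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InternalMessage:
    contents: str
    timestamp: int  # m.T in the description
    origin: int     # identity of the preparing microprocessor


@dataclass
class Processor:
    ident: int
    clock: int = 1                              # logical clock, initialised to 1
    received: set = field(default_factory=set)  # the message pool

    def on_external(self, contents: str) -> InternalMessage:
        """Compose an internal message, pool it, then advance the clock by 1."""
        m = InternalMessage(contents, self.clock, self.ident)
        self.received.add(m)
        self.clock += 1
        return m  # a real node would now sign m and send it to its peers

    def on_internal(self, m: InternalMessage) -> bool:
        """Pool a new message and advance the clock past its time-stamp."""
        if m in self.received:
            return False
        self.received.add(m)
        self.clock = max(self.clock, m.timestamp + 1)
        return True


def stable_messages(proc: Processor, smallest_clock_in_node: int):
    """Messages time-stamped below the smallest clock in the node, in order."""
    ready = [m for m in proc.received if m.timestamp < smallest_clock_in_node]
    return sorted(ready, key=lambda m: (m.timestamp, m.origin))
```

In this sketch, a message prepared by one microprocessor with time-stamp 1 moves the receiving microprocessor's clock to 2, so that the receiver cannot afterwards prepare a message with a time-stamp of 1 or less.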

A non-faulty microprocessor P_(i) detects late messages using time-outs that are set up as a result of receiving or sending messages. The principles behind setting these time-outs are as follows:

(i) suppose P_(i) prepares and sends a message m to every other P_(j) at its (physical) clock time t; after its clock time t+d, every P_(j) should have C_(j) ≧m.T (time-stamp of m) and hence after its clock time t+2d, P_(i) will not accept any different message m', with m'.T≦m.T;

(ii) suppose that P_(i) receives a message m at its clock time t; any non-faulty microprocessor P_(j) that signed m must have had its C_(j) >m.T at the time it sent m. Therefore, allowing d for possible message take-over during transmission, P_(i) should not accept any single-signed m' from any P_(j) after its clock time t+d; and

(iii) suppose that in case (ii), P_(i) is a microprocessor that has not signed m. P_(i) may receive m as late as t+d according to P_(i)'s clock. So P_(i) must accept any single-signed message m' from P_(j) so long as its clock reads less than t+2d.

Similar time-outs for multiple signed messages can be derived, and careful analysis indicates that each additional signature in a message will increase the time-out for accepting that message by 2d. For example, (as in case (i)) P_(j) after having prepared and sent m at its clock time t, should accept any new double-signed message m', with m'.T≦m.T, whilst its clock reads less than t+4d. This complexity is necessary to prevent faulty microprocessors corrupting the ordered queues of correct microprocessors.
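The time-out arithmetic for case (i) may be sketched as below, purely as an illustration. The closed form t+2d·s is an extrapolation from the two values stated above (t+2d for a single-signed message and t+4d for a double-signed message) together with the rule that each additional signature adds 2d; the function names are hypothetical.

```python
def sender_acceptance_deadline(t: float, d: float, signatures: int) -> float:
    """Clock time after which the sender of m rejects a conflicting m'.

    t          -- sender's physical clock time when m was sent
    d          -- the delay bound d used in the time-outs
    signatures -- number of distinct signatures carried by m'
    """
    if signatures < 1:
        raise ValueError("an authentic message carries at least one signature")
    return t + 2 * d * signatures


def accepts_late_message(now: float, t: float, d: float, signatures: int) -> bool:
    """True while the sender's clock still reads less than the deadline."""
    return now < sender_acceptance_deadline(t, d, signatures)
```

With t=10 and d=1.5, a double-signed m' is accepted while the sender's clock reads less than 16 and is refused from 16 onwards.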

This scheme, unlike most ordering mechanisms, does not require that microprocessors' physical clocks are synchronised within a known bound. It provides an efficient mechanism for providing ordered message queues in an N failure masking node. In particular, for the special case, N=3 and f=1, the above technique can be optimised, resulting in a reduced ordering delay. Further optimisation still is possible when there are no microprocessor failures in the node and the communication time between two non-faulty microprocessors is much less than the estimated upper bound d. Since these conditions generally hold in practical systems, these optimisations give valuable performance enhancement to the node.

We claim:
 1. A computing system comprising a computing node arranged to receive messages from other components in the system, to process received messages, and to transmit messages to other components in the system; the computing node comprising: a) a plurality of microprocessors linked together and arranged to process received messages; b) means for ordering the messages such that similar messages in identical order are selected for processing by correctly functioning microprocessors which then produce identical outputs, and further wherein said ordering means comprises a designation of at least one microprocessor as a Leader microprocessor and at least another of said microprocessors as a Follower microprocessor whereby the Leader receives messages from outside the node and sends said messages to the Follower such that the order in which messages are processed is dictated by the Leader microprocessor; and c) means for comparing the outputs produced by the microprocessors of the node for controlling the output of the node so that nothing is output from the node unless all the microprocessors in the node give identical output, the node output then being the same as the identical outputs.
 2. A computing system according to claim 1 wherein said means for comparing the outputs produced by the microprocessors of the node and for controlling the output of the node operates so that nothing is output from the node unless more than half of the number of microprocessors in the node give identical output, the node output then being the same as the identical outputs.
 3. A computing system according to claim 1 wherein the Leader is adapted to process the information and then sends the result of this processing to the Follower so that the Follower can compare this result with its own generated result and in the event that the two results are identical, the Follower is adapted to produce a multiple signed message which is transmitted through the system.
 4. A computing system according to claim 1 wherein the Follower is provided with means which enables it to monitor messages received from outside the node whereby faults can be detected in the Leader.
 5. A computing system according to claim 1 wherein said comparison means is adapted to compare incoming messages with those produced locally so that successful messages can be countersigned by the local microprocessor and a subsequently generated multiple signed message can be transmitted through the system.
 6. A computing system according to claim 1 wherein said computing system includes receiving means which discard duplicate messages.
 7. A method for ordering messages to be processed within a fail-silent computer node comprising: a) receiving messages at a microprocessor; b) authenticating said messages so as to confirm the integrity of same; c) stamping said messages to be ordered with a time-stamp corresponding to a local clock reading at said microprocessor; d) signing said messages; e) diffusing either the signed, time-stamped message or a copy of this signed, time-stamped message via a link means to other microprocessors in the node; f) ordering a plurality of signed, time-stamped messages in time-stamped order by designating one of said microprocessors as a Leader microprocessor and designating at least one other of said microprocessors as a Follower microprocessor, and arranging for the Leader to receive messages from outside the node and send said messages to the Follower such that the order in which messages are processed is dictated by the Leader microprocessor; g) processing the ordered messages according to their time-stamped order; h) signing the processed message output; i) diffusing either this signed, processed message output or a copy of this signed, processed message output via a link means to other microprocessors in the node; and j) comparing the message outputs in the node and, where a pre-determined number of said message outputs are the same, releasing said same message outputs from said node.
 8. A method according to claim 7 wherein said predetermined number equals a number equal to all the number of microprocessors in the node.
 9. A method according to claim 7 wherein said predetermined number equals a number equal to more than half of said microprocessors in said node.
 10. A method according to claim 7 wherein the method of ordering involves a process of stabilization whereby incoming messages are delayed for a pre-determined length of time before they are queued in the time-stamped order of messages.
 11. A method according to claim 10 wherein two microprocessors are provided in said node and the process of ordering or stabilization involves: a) diffusing messages according to a First In First Out policy; b) receiving a time-stamped message with a time-stamp equal to T; and c) where T is greater than the local clock value, advancing the local clock to a time T+1 and stabilizing all messages with a time-stamp less than or equal to T; or d) where T is less than or equal to the local clock value, stabilizing all messages with a time-stamp less than or equal to T.